Rockport
Measuring Non-Adversarial Reproduction of Training Data in Large Language Models
Aerni, Michael, Rando, Javier, Debenedetti, Edoardo, Carlini, Nicholas, Ippolito, Daphne, Tramèr, Florian
Large language models memorize parts of their training data. Memorizing short snippets and facts is required to answer questions about the world and to be fluent in any language. But models have also been shown to reproduce long verbatim sequences of memorized text when prompted by a motivated adversary. In this work, we investigate an intermediate regime of memorization that we call non-adversarial reproduction, where we quantify the overlap between model responses and pretraining data when responding to natural and benign prompts. For a variety of innocuous prompt categories (e.g., writing a letter or a tutorial), we show that up to 15% of the text output by popular conversational language models overlaps with snippets from the Internet. In worst cases, we find generations where 100% of the content can be found exactly online. For the same tasks, we find that human-written text has far less overlap with Internet data. We further study whether prompting strategies can close this reproduction gap between models and humans. While appropriate prompting can reduce non-adversarial reproduction on average, we find that mitigating worst-case reproduction of training data requires stronger defenses, even for benign interactions.
- North America > United States > California > San Francisco County > San Francisco (0.04)
- North America > United States > New York (0.04)
- North America > United States > Washington (0.04)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Health & Medicine (1.00)
Multi-view deep learning for reliable post-disaster damage classification
Khajwal, Asim Bashir, Cheng, Chih-Shen, Noshadravan, Arash
This study aims to enable more reliable automated post-disaster building damage classification using artificial intelligence (AI) and multi-view imagery. The current practices and research efforts in adopting AI for post-disaster damage assessment are generally (a) qualitative, lacking refined classification of building damage levels based on standard damage scales, and (b) trained on aerial or satellite imagery with limited views, which, although indicative, are not completely descriptive of the damage scale. To enable more accurate and reliable automated quantification of damage levels, the present study proposes the use of more comprehensive visual data in the form of multiple ground and aerial views of the buildings. To obtain such a spatially aware damage prediction model, a Multi-view Convolution Neural Network (MV-CNN) architecture is used that combines the information from different views of a damaged building. This spatial 3D context damage information will result in more accurate identification of damage and more reliable quantification of damage levels. The proposed model is trained and validated on a reconnaissance visual dataset containing expert-labeled, geotagged images of the buildings inspected following Hurricane Harvey. The developed model demonstrates reasonably good accuracy in predicting the damage levels and can be used to support more informed and reliable AI-assisted disaster management practices.
- North America > United States > Virginia > Fairfax County > Reston (0.04)
- North America > United States > Texas > Brazos County > College Station (0.04)
- North America > United States > Texas > Aransas County > Rockport (0.04)
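The abstract of "Multi-view deep learning for reliable post-disaster damage classification" says the MV-CNN combines information from different views but does not say how. One common choice in multi-view CNN designs is element-wise max pooling across per-view feature vectors; the sketch below assumes that choice. The function `view_pool` and the toy feature values are hypothetical stand-ins for the outputs of per-view CNN branches.

```python
def view_pool(view_features: list[list[float]]) -> list[float]:
    """Combine per-view feature vectors by element-wise max.

    Assumption: max pooling across views. The abstract does not state
    which fusion operation the paper's model uses.
    """
    if not view_features:
        raise ValueError("need at least one view")
    dim = len(view_features[0])
    if any(len(v) != dim for v in view_features):
        raise ValueError("all views must have the same feature dimension")
    # For each feature index, keep the strongest response over all views.
    return [max(v[i] for v in view_features) for i in range(dim)]


if __name__ == "__main__":
    ground_view = [0.1, 0.9, 0.0]  # hypothetical features from a ground image
    aerial_view = [0.7, 0.2, 0.3]  # hypothetical features from an aerial image
    print(view_pool([ground_view, aerial_view]))
```

Max pooling makes the fused vector independent of view order, so the same building yields the same features however its images are listed.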
Insurers Are Set to Use Drones to Assess Harvey's Property Damage
Property insurers are preparing to fly dozens of drones over homes and businesses to assess damage in the wake of Tropical Storm Harvey, the first widespread use of unmanned aircraft to size up catastrophe claims. Insurers have been testing drones and using them on a small scale since getting Federal Aviation Administration approval in 2015 to use the technology for U.S. inspections. Drones provide aerial images that can help insurance adjusters inspect buildings faster and more safely, executives say, part of a larger industry effort to speed up time-consuming claims. The storm presents the first opportunity for some of these insurers to test their new fleets on a large scale. Harvey, which made landfall in Texas last week and moved to Louisiana on Wednesday, is estimated to have caused up to $20 billion in insurable damage.
- North America > United States > Louisiana (0.25)
- North America > United States > Texas > Nueces County > Corpus Christi (0.05)
- North America > United States > Texas > Aransas County > Rockport (0.05)
- Banking & Finance > Insurance (1.00)
- Government > Regional Government > North America Government > United States Government (0.52)